1 Introduction

The understanding of molecular cell biology requires insight into the struc- 
ture and dynamics of networks that are made up of thousands of interacting 
molecules of DNA, RNA, proteins, metabolites, and other components. One 
of the central goals of systems biology is the unraveling of the as yet poorly 
characterized complex web of interactions among these components. This work 
is made harder by the fact that new species and interactions are continuously 
discovered in experimental work, necessitating the development of adaptive and 
fast algorithms for network construction and updating. Thus, the
"reverse-engineering" of networks from data has emerged as one of the central
concerns of systems biology research.

A variety of reverse-engineering methods have been developed, based on tools 
from statistics, machine learning, and other mathematical domains. In order 
to effectively use these methods, it is essential to develop an understanding of 
the fundamental characteristics of these algorithms. With that in mind, this 
chapter is dedicated to the reverse-engineering of biological systems. 
Specifically, we focus our attention on a particular class of methods for
reverse-engineering, namely those that rely algorithmically upon the so-called 
"hitting-set" problem, which is a classical combinatorial and computer science 
problem. Each of these methods utilizes a different algorithm in order to obtain 
an exact or an approximate solution of the hitting set problem. We will explore 
the ultimate impact that the alternative algorithms have on the inference of 
published in silico biological networks. 

2 Reverse Engineering of Biological Networks 

Systems biology aims at a systems-level understanding of biology, viewing or- 
ganisms as integrated and interacting networks of genes, proteins, and other 
molecular species through biochemical reactions that result in particular form 
and function (phenotype). Under this "system" conceptualization, it is the
interactions among components that give rise to emergent properties.

Systems-level ideas have been a recurrent theme in biology for several decades, 
as exemplified by Cannon's work on homeostasis [7], Wiener's biological
cybernetics [33], and Ludwig von Bertalanffy's foundations of general systems
theory [32]. So what has brought systems biology to the mainstream of biological
science research in recent years? The answer can be found in large part in 
enabling technological advances, ranging from high-throughput biotechnology 
(gene expression arrays, mass spectrometers, etc.) to advances in information 
technology, that have revolutionized the way that biological knowledge is stored, 
retrieved and processed. 

A systems approach to understanding biology can be described as an itera- 
tive process which includes: (1) data collection and integration of all available 
information (ideally, regarding all the components and their relationships in 
the organism of interest), (2) system modeling, (3) experimentation at a global 
level, and (4) generation of new hypotheses (see Fig. 2.1). 

The current chapter focuses on the system modeling aspects, and, specifically,
on the top-down modeling approach broadly known as biological
"reverse-engineering", which can be very broadly described as follows:

The biological reverse engineering problem is that of analyzing a 
given system in order to identify, from biological data, the compo- 
nents of the system and their relationships. 

In broad terms, there are two very different levels of representation for bio- 
logical networks. They are described as follows. 

(a) Network Topology Representations 

Also known as "wiring diagrams" or "static graphs", these are coarse diagrams
or maps that represent the connections (physical, chemical, or statistical)
among the various molecular components of a network. At this level, no
detailed kinetic information is included. A network of molecular interactions
can be viewed as a graph: cellular components are nodes in a network, and the
interactions (binding, activation, inhibition, etc.) between these components
are the edges that connect the nodes. A reconstruction of network topology
allows one to understand properties that might remain hidden without the model
or with a less relevant model.

Figure 2.1: Iterative Process in Systems Biology.

These types of models can be enriched by adding information on nodes or
edges. For instance, '+' or '-' labels on edges may be used in order to indicate
positive or negative regulatory influences. The existence of an edge might be 
specified as being conditional on the object being studied (for instance a cell) 
being in a specific global state, or on a particular gene that regulates that par- 
ticular interaction being expressed above a given threshold. These latter types 
of additional information, however, refer implicitly to notions of state and tem- 
poral evolution, and thus lead naturally towards qualitative dynamical models. 

Reverse-engineering methods for topology identification differ in the
types of graphs considered. For example, in the work in [3,9,11,24,26,28,34,
35], edges represent statistical correlation between variables. In [10, 13, 15, 17], 
edges represent causal relationships among nodes. 

(b) Network Dynamical Models 

Dynamical models represent the time-varying behavior of the different
molecular components in the network, and thus provide a more accurate
representation of biological function.

Models can be used to simulate the biological system under study. Different 
choices of values for parameters correspond either to unknown system charac- 
teristics or to environmental conditions. The comparison of simulated dynamics 
with experimental measurements helps refine the model and provide insight on 
qualitative properties of behavior, such as the identification of steady states 
or limit cycles, multi-stable (e.g., switch-like) behavior, the characterization of
the role of various parts of the network in terms of signal processing (such as 
amplifiers, differentiators and integrators, logic gates), and the assessment of 
robustness to environmental changes or genetic perturbations. 

Examples of this type of inference include those leading to various types of 
Boolean networks [2,20-22] or systems of differential equations [12,16,30], as 
well as multi-state discrete models [19]. 

Depending upon the type of network analyzed, data availability and quality, 
network size, and so forth, the different reverse-engineering methods offer
different advantages and disadvantages relative to each other. In Section 2.1,
we will explore some of the common approaches to their systematic evaluation
and comparison.

2.1 Evaluation of the Performance of Reverse Engineering 
Methods 

The reverse-engineering problem is by its very nature highly "ill-posed", in the
sense that solutions will be far from unique. This lack of uniqueness stems from
the many sources of uncertainty: measurement error, lack of knowledge of all
the molecular species that are involved in the behavior being analyzed ("hidden
variables"), stochasticity of molecular processes, and so forth. In that sense,
reverse-engineering methods can at best provide approximate solutions for the 
network that one wishes to reconstruct, making it very difficult to evaluate their 
performance through a theoretical study. Instead, their performance is usually 
assessed empirically, in the following two ways: 

Experimental testing of predictions: after a model has been inferred, the 
newly found interactions or predictions can be tested experimentally for 
network topology and network dynamics inference, respectively. 

Benchmarking testing: this type of performance evaluation consists of
measuring how "close" the method of interest comes to recovering a known
network, referred to as the "gold standard" for the problem. In the case
of dynamical models, one evaluates the ability of the method of interest to
reproduce observations that were not taken into account in the "training"
phase involved in the construction of the model. On the other hand, for
methods that only reconstruct the network topology (wiring diagram), a
variety of standard metrics may be applied.
Metrics for Network Topology Benchmarking

Suppose that $F$ is the graph representing the network topology of a chosen
"gold standard" network. Let $\hat{F}$ be the graph representing the inferred
network topology. Each possible interaction can be classified into one of the
following four classes when $\hat{F}$ is compared to the gold standard:

(a) Correct interactions inferred (true positives, TP) 

(b) Incorrect interactions inferred (false positives, FP) 

(c) Correct non-interactions inferred (true negatives, TN) 

(d) Incorrect non-interactions inferred (false negatives, FN)

From this classification of the interactions, we compute the following metrics: 

• The Recall or True Positive Rate TPR = TP / (TP + FN).

• The False Positive Rate FPR = FP / (FP + TN).

• The Accuracy ACC = (TP + TN) / TotI, where TotI is the total number
of possible interactions in a network.

• The Precision or Positive Predictive Value PPV = TP / (TP + FP).

As mentioned earlier, the reverse-engineering problem is underconstrained.
Every algorithm will have one or more free parameters that help select a "best"
possible prediction. Hence, a more objective evaluation of performance has to
somehow involve a range of parameter values. One way to evaluate performance
across ranges of parameters is the receiver operating characteristic (ROC)
method, based on the plot of FPR vs. TPR values. The resulting ROC plot
depicts relative trade-offs between true positive predictions and false positive
predictions across different parameter values (see Fig. 2.2). A closely related
approach is the Recall-Precision plot, obtained by plotting TPR vs. PPV
values.
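
As an illustration of the metrics above, the following is a minimal sketch in Python. It assumes that both networks are given as sets of directed edges (source, target) over a common node list and that TotI counts every ordered pair of distinct nodes (self-loops optional). The function names and the edge-set representation are our own illustrative choices, not part of any of the methods discussed.

```python
from itertools import permutations


def confusion_counts(nodes, gold_edges, inferred_edges, allow_self_loops=False):
    """Count TP, FP, TN, FN over all possible directed interactions."""
    gold, inferred = set(gold_edges), set(inferred_edges)
    # Assumption: every ordered pair of distinct nodes is a possible interaction.
    possible = set(permutations(nodes, 2))
    if allow_self_loops:
        possible |= {(v, v) for v in nodes}
    tp = len(possible & gold & inferred)
    fp = len((possible & inferred) - gold)
    fn = len((possible & gold) - inferred)
    tn = len(possible) - tp - fp - fn
    return tp, fp, tn, fn


def topology_metrics(tp, fp, tn, fn):
    """Return TPR, FPR, ACC and PPV; a ratio with zero denominator is NaN."""
    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "TPR": ratio(tp, tp + fn),
        "FPR": ratio(fp, fp + tn),
        "ACC": ratio(tp + tn, tp + fp + tn + fn),
        "PPV": ratio(tp, tp + fp),
    }


if __name__ == "__main__":
    gold = {("a", "b"), ("b", "c")}
    inferred = {("a", "b"), ("c", "a")}
    counts = confusion_counts(["a", "b", "c"], gold, inferred)
    print(counts, topology_metrics(*counts))
```

Running the method for several parameter values and collecting the resulting (FPR, TPR) pairs gives the points of the ROC plot; the (TPR, PPV) pairs give the Recall-Precision plot.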

3 Classical Combinatorial Algorithms: A Case 
Study 

We have briefly discussed some basic aspects of reverse-engineering of biological
systems. Next, as a case study, we focus our attention on some
reverse-engineering algorithms that rely upon the solution of the so-called
"Hitting Set Problem". The Hitting Set Problem is a classical problem in
combinatorics and computer science. It is defined as follows:
Problem 1 (HITTING SET Problem) Given a collection $\mathcal{H}$ of subsets of
$E = \{1, \dots, n\}$, find the smallest set $L \subseteq E$ such that
$L \cap X \neq \emptyset$ for all $X \in \mathcal{H}$.

Figure 2.2: Receiver operating characteristic (ROC) space, defined by FPR
vs. TPR values in a two-dimensional coordinate system. A perfect
reverse-engineering method will ideally have score (FPR, TPR) = (0, 1), whereas
the worst possible network will have coordinates (FPR, TPR) = (1, 0); scores
on or below the identity line (diagonal) indicate methods that perform no
better than a random guess.

The Hitting Set problem is NP-hard, as can be shown via transformation 
from its dual, the (Minimum) Set Cover problem [14]. 
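
Because exact solutions are expensive in general, approximations are commonly used. The following is a minimal sketch of the standard greedy heuristic for Hitting Set, which repeatedly picks the element contained in the largest number of sets not yet hit. It is an illustration only, not the implementation used by any of the methods discussed in this chapter; the function name and the tie-breaking rule are our own choices, and the result is a valid hitting set that is not guaranteed to be a smallest one.

```python
def greedy_hitting_set(collection):
    """Greedy approximation: pick the element that hits the most unhit sets."""
    remaining = [set(s) for s in collection]
    if any(not s for s in remaining):
        raise ValueError("an empty set cannot be hit")
    chosen = set()
    while remaining:
        # Count, for every element, how many of the unhit sets contain it.
        counts = {}
        for s in remaining:
            for e in s:
                counts[e] = counts.get(e, 0) + 1
        # Ties are broken by the smallest element, so the output is deterministic.
        best = min(counts, key=lambda e: (-counts[e], e))
        chosen.add(best)
        remaining = [s for s in remaining if best not in s]
    return chosen


if __name__ == "__main__":
    print(greedy_hitting_set([{1, 2}, {2, 3}, {3, 4}]))
```

On the small instance in the usage example the greedy choice happens to be optimal, since the disjoint sets {1, 2} and {3, 4} force any hitting set to contain at least two elements.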

We next introduce some reverse engineering methods based on the hitting 
set approach. 

• Ideker et al. [15]. 

This paper introduces two methods to infer the topology of a gene reg- 
ulatory network from gene expression measurements. The first "network 
inference" step consists of the estimation of a set of Boolean networks 
consistent with an observed set of steady-state gene expression profiles, 
each generated from a different perturbation to the genetic network stud- 
ied. Next, an "optimization step" involves the use of an entropy-based 
approach to select an additional perturbation experiment in order to per- 
form a model selection from the set of predicted Boolean networks. In 
order to compute the sparsest network that interpolates the data, Ideker 
et al. rely upon the "Minimum Set Cover" problem. An approximate 
solution for the Hitting Set problem is obtained by means of a branch and 
bound technique [25]. Assessment is performed "in numero": the proposed
method is evaluated on simulated networks with varying numbers of
genes and numbers of interactions per gene.

• Jarrah et al. [13] 

This paper introduces a method for the inference of the network topology 
from gene expression data, from which one extracts state transition
measurements of wild-type and perturbation data. The goal of this
reverse-engineering algorithm is to output one or more most likely network
topologies for a collection $x_1, \dots, x_n$ of molecular species (genes,
proteins, etc.), which we will refer to as variables. The state of a molecular
species can represent its level of activation. That is, each variable $x_i$
takes values in the set $X = \{0, 1, 2, \dots\}$, and the interactions among
species indicate causal relationships among molecular species. The inference
algorithm takes as input one or more time courses of observational data. The
output is a most likely network structure for the interactions among
$x_1, \dots, x_n$ that is consistent with the observational data. The notion of
consistency with observational data makes the assumption that the regulatory
network for $x_1, \dots, x_n$ can be viewed as a dynamical system that is
described by a function $f : X^n \to X^n$, which transforms an input state
$(s_1, \dots, s_n)$, $s_i \in X$, of the network into an output state
$(t_1, \dots, t_n)$ at the next time step. A directed edge $x_i \to x_j$ in the
graph of the network topology of this dynamical system $f$ indicates that the
value of $x_j$ under application of $f$ depends on the value of $x_i$. Hence a
directed graph is consistent with a given time course $s_1, \dots, s_m$ of
states in $X^n$ if it is the network topology of a function
$f : X^n \to X^n$ that reproduces the time course, that is,
$f(s_i) = s_{i+1}$ for all $i = 1, \dots, m-1$.

One possible drawback of reverse-engineering approaches lies in the fact
that they construct the "sparsest" possible network consistent with the
given data. However, real biological networks are known not to be
minimal [31]. Although accurate measures of deviation from sparsity are
difficult to estimate, it seems reasonable to allow additional edges
in the network in a "controlled" manner that is consistent with the given
data. As already commented in [15], it is possible to add redundancies to
the reverse-engineering construction. The basic hitting set approach
provides only a minimal set of connections, whereas real biological networks
are known to contain redundancies (e.g., see [31]). To account for this,
one can modify the hitting set approach to add redundancies systematically
by allowing additional parameters to control the extra connections.
Theoretically, in terms of the algorithm this corresponds to a standard
generalization of the set-cover problem, known as the set-multicover
problem, which is well studied in the literature, and for which approximation
algorithms are known [4].

The search for the topologies that interpolate the input data directly
involves the Hitting Set problem, which is solved analytically with the use
of computational algebra tools.

The algorithms presented in [5, 17] also make use of Hitting Set algorithms, but 
we will restrict our attention to the comparison of the two methods described 
above. 
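
To make the link between consistency with a time course and the Hitting Set problem concrete, consider a single target variable. If two transitions of the time course lead to different next values of the target, then any set of inputs for that target must contain at least one variable on which the two current states differ. The sketch below builds this collection of "difference sets" and enumerates its smallest hitting sets by exhaustive search. It is only an illustration under these assumptions: it is not the computational algebra procedure of [13], the function names are hypothetical, variables are indexed from 0, and the exhaustive search is exponential in the number of variables.

```python
from itertools import combinations


def difference_sets(time_course, target):
    """Sets of variable indices that any valid input set for `target` must hit."""
    transitions = list(zip(time_course[:-1], time_course[1:]))
    collection = set()
    for (s_a, next_a), (s_b, next_b) in combinations(transitions, 2):
        if next_a[target] != next_b[target]:
            diff = frozenset(k for k in range(len(s_a)) if s_a[k] != s_b[k])
            if not diff:
                # Identical states with different successors: no function f fits.
                raise ValueError("time course is not consistent with any function")
            collection.add(diff)
    return collection


def minimum_hitting_sets(collection, n):
    """All smallest hitting sets over {0, ..., n-1}, found by exhaustive search."""
    for size in range(n + 1):
        hits = [set(c) for c in combinations(range(n), size)
                if all(set(c) & s for s in collection)]
        if hits:
            return hits
    return []


if __name__ == "__main__":
    # Time course generated by f(x0, x1, x2) = (x1, x2, x0).
    course = [(1, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)]
    for j in range(3):
        print(j, minimum_hitting_sets(difference_sets(course, j), 3))
```

For target variable 0 in the usage example, the only smallest hitting set is {1}, which corresponds to the edge from the second variable to the first and agrees with the function that generated the data.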

3.1 Benchmarking RE Combinatorial-Based Methods 
3.1.1 In Silico Gene Regulatory Networks 

We use data from two different regulatory networks. These contain some
features that are common in real regulatory networks, such as time delays and
the need for measurement data to be discretized into states (0, 1, 2, ...).

In Silico Network 1: Gene Regulatory Network with External
Perturbations. This network was originally introduced in [6]. It was generated
using the software package given in [23]. The interactions between genes in this
regulatory network are phenomenological, and represent the net effect of
transcription, translation, and post-translational modifications on the
regulation of the genes in the network. The model is implemented as a system of
ODEs in Copasi [18].

This network, shown in Fig. 3.3, consists of 13 species: ten genes plus three
different environmental perturbations. The perturbations affect the
transcription rate of the gene on which they act directly (through inhibition
or activation) and their effect is propagated throughout the network by the
interactions between the genes.

Figure 3.3: Network 1: 10 genes and 3 environmental perturbations. In this
network, the 3 environmental perturbations P1, P2 and P3 directly affect the
expression rate of genes G1, G2 and G5, respectively.

Network 2: Segment Polarity Genes Network in D. melanogaster. The
network of segment polarity genes is responsible for pattern formation in the 
Drosophila melanogaster embryo. Albert and Othmer [1] proposed and analyzed 
a Boolean model based on the binary ON/OFF representation of mRNA and 
protein levels of five segment polarity genes. This model was constructed based 
on the known topology and it was validated using published gene and expression 
data. We generated time courses from this model, from which we will attempt 
to reverse-engineer the network in order to benchmark the performance of the 
reverse-engineering algorithms being evaluated. 

The network of the segment polarity genes represents the last step in the
hierarchical cascade of gene families initiating the segmented body of the fruit
fly. The genes of this network include engrailed (en), wingless (wg),
hedgehog (hh), patched (ptc), cubitus interruptus (ci) and sloppy paired (slp),
coding for the corresponding proteins, which are represented by capital letters
(EN, WG, HH, PTC, CI and SLP). Two additional proteins, resulting from
transformations of the protein CI, also play important roles: CI may be
converted into a transcriptional activator, CIA, or may be cleaved to form a
transcriptional repressor, CIR. The expression of the segment polarity genes
occurs in stripes that encircle the embryo. The key features of these patterns
can be represented in one dimension by a line of 12 interconnected cells,
grouped into 3 parasegment primordia, in which the genes are expressed every
fourth cell. In Albert and Othmer [1], parasegments are assumed to be identical,
and thus only one parasegment of four cells is considered. Therefore, in the
model, the variables are the expression levels of the segment polarity genes
and proteins (listed above) in each of the four cells, and the network can be
seen as a 15 × 4 = 60 node network. Using the wild-type pattern from [1], we
consider one wild-type time series of length 23.




Figure 3.4: Segment Polarity Genes Network in D. melanogaster. This
network consists of the interactions of 60 molecular species: genes and proteins.



3.1.2 Results of Comparison 

In this section we compare the results obtained after running Jarrah et al.'s
and Ideker et al.'s methods on each of the above networks. Computations were
made on Mac OS X with a 2 GHz Intel Core 2 Duo processor.

Table 3.1: Comparison of RE methods

As we mentioned in Section 3, for Jarrah et al.'s method, the input data must
be discrete. Hence, in order to apply this reverse-engineering method to network
1, we discretize the input data, and treat the choice of discretization as the
running parameter with which to test Jarrah et al.'s method in the ROC space.
We specifically use three discretization methods: a graph-theoretic approach
"D" (see [8]), as well as quantile "Q" discretization (in which each variable
state receives an equal number of data values) and interval "I" discretization
(in which we select thresholds that separate the different discrete
values).
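
The quantile and interval discretizations described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the code used to produce the results in this chapter: the function names are hypothetical, the thresholds for the interval method are supplied by the user, and the quantile method assigns states by rank, so tied values may end up in different states.

```python
def interval_discretize(values, thresholds):
    """Map each value to the number of thresholds it meets or exceeds."""
    cuts = sorted(thresholds)
    return [sum(v >= c for c in cuts) for v in values]


def quantile_discretize(values, n_states):
    """Assign states by rank so that each state receives (nearly) equally many values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    states = [0] * len(values)
    for rank, i in enumerate(order):
        states[i] = rank * n_states // len(values)
    return states


if __name__ == "__main__":
    series = [0.1, 0.9, 0.4, 0.7, 0.2, 0.5]
    print(interval_discretize(series, [0.3, 0.6]))
    print(quantile_discretize(series, 3))
```

The graph-theoretic method "D" of [8] is not sketched here, since its details are not given in this chapter.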

For Ideker et al.'s method we have considered both Greedy and Linear
Programming approximations to the Hitting Set problem as well as redundancy
values (how many extra edges one allows) of R = 1 or 2.

We display some of our results in Table 3.1. We observe that for network 1,
Jarrah et al.'s method obtains better results than Ideker et al.'s method
when considering these values in the ROC space, although both fare very poorly:
Ideker et al.'s method achieves a performance no better than random guessing
on this network. In contrast, for network 2, Jarrah et al.'s method could not
obtain any results after running for over 12 hours, whereas Ideker et al.'s
method was able to compute results for this network in less than 1 minute.
Also, Ideker et al.'s method slightly improved its results when the redundancy
number was increased; this might indicate a shortcoming of inferring sparse
networks when the true networks are larger and contain redundancies.
3.2 Software Availability

The implementation of Jarrah et al.'s algorithm [13] is available online through
the web interface provided at http://polymath.vbi.vt.edu/polynome/. The
implementation of Ideker et al.'s algorithm [15] is available online through the
web interface provided at http://sts.bioengr.uic.edu/causal/.

4 Concluding Remarks 

In this chapter, we first provided a brief discussion of the biological reverse- 
engineering problem, which is a central problem in systems biology. As a case 
study, we then focused on two methods that rely upon the solution of the
"Hitting Set problem", but which differ in their approach to solving this
problem, thus leading to different performance.

In terms of network inference power, we hypothesize that, for the smaller
network, the poor quality of the results when using Jarrah et al.'s approach
might be ascribed to the type of data used: in [13] it is claimed that the
method performs better if perturbation data are added. The algorithm is able to
consider both wild-type and mutant data to infer the network, and results would
probably improve with such additional data. In the case of Ideker et al.'s
method, we think that for both networks the low quality of results could be
due to its inability to use more than one time series at a time, as well as
the fact that the implementation of the method does not include self-loops
(edges connecting a node to itself, which may, for example, represent
degradation terms in biochemical systems). We believe that this feature is
fundamental for good performance of the algorithm.

When comparing the computational efficiency of the approaches, one should 
keep in mind that there will always be a difference between exact solutions 
and approximate solutions based upon greedy algorithms or linear programming 
relaxations. However, since the size of the networks was fairly small, it is possible 
that the reason for which Jarrah's method did not find a solution within a 
reasonable time might lie in encoding issues rather than intrinsic computational 
complexity of the problem. 

Acknowledgments The authors would like to thank Joe Dundas for the
implementation and maintenance of the web tool for Ideker et al.'s method. We
would also like to thank Dr. Brandilyn Stigler for useful discussions on
different aspects of this book chapter. This work was supported in part by grants
AFOSR FA9550-08, NIH 1R01GM086881, and NSF grants DMS-0614371, DBT
0543365, IIS-0612044, IIS-0346973 and the DIMACS special focus on
Computational and Mathematical Epidemiology.






References 

[1] R. Albert, H. Othmer. The topology of the regulatory interactions
predicts the expression pattern of the segment polarity genes in Drosophila
melanogaster. J. Theor. Biol. 223: 1-18 (2003).

[2] T. Akutsu, S. Miyano, S. Kuhara. Inferring qualitative relations in genetic 
networks and metabolic pathways. Bioinformatics (2000) 16(8): 727-34. 

[3] M.J. Beal, F. Falciani. A Bayesian approach to reconstructing genetic regu- 
latory networks with hidden factors. Bioinformatics 21(3): 349-356 (2005). 

[4] P. Berman, B. DasGupta, E. Sontag. Randomized Approximation Algo- 
rithms for Set Multicover Problems with Applications to Reverse Engineer- 
ing of Protein and Gene Networks. Discrete Applied Mathematics, 155 
(6-7): 733-749 (2007). 

[5] P. Berman, B. DasGupta, E. Sontag. Algorithmic issues in reverse en- 
gineering of protein and gene networks via the Modular Response Analy- 
sis method. Annals of the New York Academy of Sciences, 1115: 132-141 
(2007). 

[6] D. Camacho, P. Vera-Licona, P. Mendes, R. Laubenbacher. Comparison
of reverse-engineering methods using an in silico network. Ann. NY Acad.
Sci. 1115(1): 73-89 (2007).

[7] W.B. Cannon. The wisdom of the body. Norton, New York (1993). 

[8] E. Dimitrova, L. Garcia-Puente, A.S. Jarrah, R. Laubenbacher, B. Stigler, 
M. Stillman, P. Vera-Licona. Parameter estimation for Boolean models of 
biological networks, to appear in Theoretical Computer Science. 

[9] N. Dojer, A. Gambin, A. Mizera, B. Wilczynski, J. Tiuryn. Applying dy- 
namic Bayesian networks to perturbed gene expression data. BMC Bioin- 
formatics 7(1): 249 (2006). 

[10] N. Friedman, M. Linial, I. Nachman, D. Pe'er. Using Bayesian Networks
to Analyze Expression Data. Journal of Computational Biology 7(3-4):
601-620 (2000).

[11] A. de la Fuente, N. Bing, I. Hoeschele, P. Mendes. Discovery of
meaningful associations in genomic data using partial correlation coefficients.
Bioinformatics, 20: 3565-74 (2004). 

[12] T.S. Gardner, D. di Bernardo, D. Lorenz, J.J. Collins. Inferring Genetic
Networks and Identifying Compound Mode of Action via Expression Profiling.
Science 301(5629): 102-105 (2003). 

[13] A.S. Jarrah, R. Laubenbacher, B. Stigler, M. Stillman. Reverse-engineering
polynomial dynamical systems. Adv Applied Math 39(4): 477-489 (2007). 






[14] R.M. Karp. Complexity of Computer Computations. Chapter: Reducibility 
among combinatorial problems. New York: Plenum Press (1972).

[15] T.E. Ideker, V. Thorsson, R.M. Karp. Discovery of regulatory interactions
through perturbation: inference and experimental design. Pac. Symp. Bio- 
comput., 305-16 (2000). 

[16] J. Kim, D. Bates, I. Postlethwaite, P. Heslop-Harrison, K.H. Cho.
Least-squares methods for identifying biochemical regulatory networks from noisy
measurements. BMC Bioinformatics 8(1): 8 (2007). 

[17] B. Krupa. On the Number of Experiments Required to Find the Causal
Structure of Complex Systems. Journal of Theoretical Biology, 219(2):
257-267 (2002).

[18] S. Hoops, S. Sahle, R. Gauges, C. Lee, J. Pahle, N. Simus, M. Singhal, L. 
Xu, P. Mendes, U. Kummer. COPASI - a COmplex PAthway Simulator. 
Bioinformatics 22, 3067-3074 (2006). 

[19] R. Laubenbacher, B. Stigler. A computational algebra approach to the re- 
verse engineering of gene regulatory networks. J Theor Biol. 229, 523-37 
(2004). 

[20] S. Liang, S. Fuhrman, R. Somogyi. Reveal, a general reverse engineering 
algorithm for inference of genetic network architectures. Pac. Symp. Bio- 
comput.: 18-29 (1998). 

[21] S. Martin, Z. Zhang, A. Martino, J.L. Faulon. Boolean Dynamics of Genetic
Regulatory Networks Inferred from Microarray Time Series Data. Bioinfor- 
matics 23(7): 866-874 (2007). 

[22] S. Mehra, W.S. Hu, G. Karypis. A Boolean algorithm for reconstructing
the structure of regulatory networks. Metabolic Engineering 6(4): 326
(2004). 

[23] P. Mendes. Biochemistry by numbers: simulation of biochemical pathways 
with Gepasi 3. Trends Biochem. Sci. 22:361-363 (1997). 

[24] N. Nariai, Y. Tamada, S. Imoto, S. Miyano. Estimating gene regulatory 
networks and protein-protein interactions of Saccharomyces cerevisiae from 
multiple genome-wide data. Bioinformatics 21(suppl 2): ii206-212 (2005). 

[25] G. L. Nemhauser. Integer and combinatorial optimization. Wiley, New York
(1988).

[26] I. Pournara, L. Wernisch. Reconstruction of gene networks using Bayesian 
learning and manipulation experiments. Bioinformatics 20(17): 2934-2942 
(2004). 






[27] J. Chu, S. Weiss, V. Carey, B. Raby. A graphical model approach for infer- 
ring large-scale networks integrating gene expression and genetic polymor- 
phism. BMC Systems Biology 3(55) (2009). 

[28] J.J. Rice, Y. Tu, G. Stolovitzky. Reconstructing biological networks using 
conditional correlation analysis. Bioinformatics 21(6): 765-773 (2005). 

[29] E. D. Sontag. Inferring dynamic architecture of cellular networks using time 
series of gene expression, protein and metabolite data. Bioinformatics. 20, 
1877-86 (2004). 

[30] E. Sontag, A. Kiyatkin and B.N. Kholodenko. Network reconstruction based 
on steady-state data. Essays in Biochemistry. 45, 161-176 (2008). 

[31] G. Tononi, O. Sporns, G. H. Edelman. Measures of degeneracy and
redundancy in biological networks. PNAS 96 (6): 3257-3262 (1999).

[32] L. von Bertalanffy. General System Theory. Braziller, New York (1968).

[33] N. Wiener. Cybernetics or Control and Communication in the Animal and 
the Machine. The MIT Press, Cambridge (1948). 

[34] J. Yu, V. Smith, P. Wang, A. Hartemink, E. Jarvis. Advances to Bayesian
network inference for generating causal networks from observational biolog- 
ical data. Bioinformatics. 20, 3594-603 (2004). 

[35] M. Zou, S. D. Conzen. A new dynamic Bayesian network (DBN) approach 
for identifying gene regulatory networks from time course microarray data. 
Bioinformatics 21(1): 71-79 (2005). 






